This link was included with the announcement you received on Sunday.
I will use the survey results to design the class.
Course Overview
What do we mean by “data science”?
We will learn and practice a series of methods for organizing, collecting, visualizing, manipulating, and exploring different kinds of data.
We focus on creating data and applying methods, not on theoretical or foundational questions.
This is not a mathematics course, nor will it resemble a traditional introductory statistics class. We will spend the entire semester writing code to apply data science concepts.
Why do data science? One perspective:
There is too much information in the world.
e.g., every minute, approximately 500 hours of video are uploaded to YouTube.
People value useful information and new knowledge.
Almost none of those 500 hours are worth your finite time.
Data science transforms data into useful information.
How do people do data science?
R and Python are the two most popular programming languages for data science.
We will be using R in this class.
However, the main learning goal of this class is not R.
The main learning goal is to understand:
what good data is
how to ask good questions of data
how we can use good data to answer good questions
What does the data science process look like?
flowchart LR
A[Import] --> B[Tidy]
B --> C[Transform]
subgraph Understand
direction LR
C --> D[Visualize]
D --> E[Model]
E --> C
end
Understand --> F[Communicate]
subgraph Program
direction LR
A
B
Understand
F
end
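The stages in the flowchart can be sketched as a chain of small functions. This is only an illustration of how the steps hand data to one another, not the workflow we will use in class: the class uses R, and Python's standard library is used here only so the sketch runs without extra packages. The data set, column names, and function names are all hypothetical.

```python
import csv
import io
import statistics

# Hypothetical raw data: video lengths in minutes, with one missing value.
RAW_CSV = """title,minutes
Intro to R,12
Cat compilation,
Tidy data talk,45
Regression basics,30
"""


def import_data(text):
    """Import: read CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def tidy(rows):
    """Tidy: drop rows with a missing length and convert lengths to integers."""
    return [
        {"title": r["title"], "minutes": int(r["minutes"])}
        for r in rows
        if r["minutes"]
    ]


def transform(rows):
    """Transform: add a derived column (length in hours)."""
    return [{**r, "hours": r["minutes"] / 60} for r in rows]


def visualize(rows):
    """Visualize: a crude text bar chart, one '#' per 5 full minutes."""
    return [f'{r["title"]:<20} {"#" * (r["minutes"] // 5)}' for r in rows]


def model(rows):
    """Model: the simplest possible summary, the mean length in minutes."""
    return statistics.mean(r["minutes"] for r in rows)


def communicate(rows, mean_minutes):
    """Communicate: turn the result into one sentence for a reader."""
    return f"{len(rows)} usable videos, mean length {mean_minutes:.1f} minutes"


# Import -> Tidy -> Transform, then Visualize and Model, then Communicate.
rows = transform(tidy(import_data(RAW_CSV)))
chart = visualize(rows)
mean_minutes = model(rows)
summary = communicate(rows, mean_minutes)

print("\n".join(chart))
print(summary)
```

In practice the Transform, Visualize, and Model steps repeat: a plot or a model often suggests a new transformation, which is the loop drawn inside the Understand box.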